Words containing ,:
Rank in Wordlist | Frequency | Word |
---|---|---|
4503 | 5 | 10,000 |
5348 | 4 | 1,500 |
5358 | 4 | 2,000 |
5365 | 4 | 5,000 |
5366 | 4 | 50,000 |
6605 | 3 | 1,000 |
6610 | 3 | 100,000 |
6624 | 3 | 20,000 |
6632 | 3 | 30,000 |
6639 | 3 | 500,000 |

Words containing %:
Rank in Wordlist | Frequency | Word |
---|---|---|
2101 | 12 | 100% |
3909 | 6 | 5% |
3910 | 6 | 50% |
4504 | 5 | 15% |
4509 | 5 | 20% |
4517 | 5 | 70% |
5349 | 4 | 10% |
5362 | 4 | 30% |
6601 | 3 | 0.1% |
6604 | 3 | 1% |

Words containing &:
Rank in Wordlist | Frequency | Word |
---|---|---|
7079 | 3 | R&D |
8984 | 2 | B&B |
10019 | 2 | S&P |
14417 | 1 | A&M |
14418 | 1 | A&O |
14502 | 1 | ARK-V&S |
14526 | 1 | AT&T's |
16932 | 1 | F&N |
18095 | 1 | J&J |
18552 | 1 | LS&Co’s |

Words containing $:
Rank in Wordlist | Frequency | Word |
---|---|---|
22128 | 1 | US$0.9 |
22129 | 1 | US$126bil |
22130 | 1 | US$200 |
22131 | 1 | US$400 |
22132 | 1 | US$5.9bil |
25956 | 1 | etc.$gIreland,$d1985 |

Words containing ":
Rank in Wordlist | Frequency | Word |
---|---|---|
153 | 107 | ." |

Words containing ':
Rank in Wordlist | Frequency | Word |
---|---|---|
522 | 42 | don't |
663 | 35 | it's |
1124 | 22 | I'm |
1126 | 22 | It's |
1549 | 17 | you're |
1824 | 14 | can't |
1840 | 14 | doesn't |
2095 | 13 | world's |
2099 | 13 | you'll |
2219 | 12 | isn't |

Words containing +:
Rank in Wordlist | Frequency | Word |
---|---|---|
13719 | 1 | 2+1 |
14149 | 1 | 5.7+IV |
14150 | 1 | 5.7+IV::l::Union |
16266 | 1 | DA+C |
16326 | 1 | DVD+1 |
19882 | 1 | PROGRAM+66 |
20450 | 1 | RAS-E13CJT+PP23S4MTH |
20481 | 1 | RS815+/RS815RP |
24618 | 1 | command+shift+4 |
32928 | 1 | user+extension |

Words containing /:
Rank in Wordlist | Frequency | Word |
---|---|---|
2020 | 13 | https://www |
2319 | 11 | I/O |
2346 | 11 | and/or |
2761 | 9 | GNU/GPL |
3262 | 8 | http://cmuir |
4987 | 5 | he/she |
5003 | 5 | http://kb2tmp |
5004 | 5 | http://libsearch |
6106 | 4 | http://www |
6595 | 4 | www.dtac.co.th/dtacreward |
In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words, while others will not. If words containing forbidden characters do not have a very low frequency, this may indicate a problem in preprocessing.
Query for words containing % (the other characters are queried analogously):
select w_id-100, freq, word from words where w_id>100 and word like "%\%%" limit 10;
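
The query above can be repeated for every character in the list. The following is a minimal sketch of such a loop, not the tool actually used for this report. It assumes a table `words(w_id, word, freq)` and the rank offset of 100, both as they appear in the query. It uses Python's `sqlite3` instead of the original database, so the escape character has to be declared explicitly with `ESCAPE`. The function names, the toy data (a few rows taken from the tables above) and the threshold `min_freq` are hypothetical.

```python
import sqlite3

# Special characters listed in the text above.
SPECIAL_CHARS = [",", "(", ")", "%", "&", "$", '"', "'", "+", "*", "=", "/", "_"]


def like_pattern(char):
    """Build a LIKE pattern that matches any word containing `char`.

    % and _ are LIKE wildcards and the backslash is the escape
    character used below, so these three must be escaped.
    """
    if char in ("%", "_", "\\"):
        char = "\\" + char
    return "%" + char + "%"


def words_containing(conn, char, offset=100, limit=10):
    """Return (rank, freq, word) rows for words containing `char`."""
    query = (
        "SELECT w_id - ?, freq, word FROM words "
        "WHERE w_id > ? AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"
    )
    params = (offset, offset, like_pattern(char), limit)
    return conn.execute(query, params).fetchall()


def suspicious_words(conn, forbidden_chars, min_freq=3):
    """Collect words with a forbidden character and freq >= min_freq.

    min_freq is an assumed threshold for "not very low frequency".
    """
    hits = []
    for char in forbidden_chars:
        for rank, freq, word in words_containing(conn, char):
            if freq >= min_freq:
                hits.append((char, rank, freq, word))
    return hits


def build_toy_db():
    """In-memory database with a few rows from the tables above (w_id = rank + 100)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)")
    rows = [(622, "don't", 42), (2201, "100%", 12), (4009, "5%", 6), (7179, "R&D", 3)]
    conn.executemany("INSERT INTO words VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = build_toy_db()
    for char in SPECIAL_CHARS:
        for row in words_containing(conn, char):
            print(char, row)
```

Which characters count as forbidden depends on the language, so `suspicious_words` takes that list as an argument rather than fixing it.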
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots